
Moloch's Bargain: Emergent Misalignment When LLMs Compete for Audiences

El, Batu, Zou, James

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly shaping how information is created and disseminated, from companies using them to craft persuasive advertisements, to election campaigns optimizing messaging to gain votes, to social media influencers boosting engagement. These settings are inherently competitive, with sellers, candidates, and influencers vying for audience approval, yet it remains poorly understood how competitive feedback loops influence LLM behavior. We show that optimizing LLMs for competitive success can inadvertently drive misalignment. Using simulated environments across these scenarios, we find that a 6.3% increase in sales is accompanied by a 14.0% rise in deceptive marketing; in elections, a 4.9% gain in vote share coincides with 22.3% more disinformation and 12.5% more populist rhetoric; and on social media, a 7.5% engagement boost comes with 188.6% more disinformation and a 16.3% increase in promotion of harmful behaviors. We call this phenomenon Moloch's Bargain for AI--competitive success achieved at the cost of alignment. These misaligned behaviors emerge even when models are explicitly instructed to remain truthful and grounded, revealing the fragility of current alignment safeguards. Our findings highlight how market-driven optimization pressures can systematically erode alignment, creating a race to the bottom, and suggest that safe deployment of AI systems will require stronger governance and carefully designed incentives to prevent competitive dynamics from undermining societal trust. There are clear economic and social incentives to optimize LLMs and AI agents for competitive markets: A company can increase its profits by generating more persuasive sales pitches, a candidate can capture a larger share of voters with sharper campaign messaging, and an influencer can boost engagement by producing more compelling social media content. 
In the presence of both the technology and the incentives, it is natural to expect adoption to move rapidly in this direction. In contrast, the incentives to ensure safety are far weaker. The costs of social hazards--such as deceptive product representation and disinformation on social media--are typically borne by the public rather than the organizations deploying these systems, who may be held accountable only when found legally liable. In this paper, we investigate the critical question: Can optimization for market success inadvertently produce misaligned LLMs? We experimentally show that misalignment consistently emerges from market competition across three different settings.
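As a toy illustration of the competitive pressure the paper describes (not the authors' actual setup, which fine-tunes models on simulated audience feedback), consider a best-of-N selection loop where candidate messages are scored purely on simulated audience approval: whichever message the audience rewards most is kept, truthful or not. The candidates and approval probabilities below are invented for the sketch.

```python
import random

# Hypothetical candidate messages: (text, audience approval prob., deceptive?)
CANDIDATES = [
    ("accurate pitch", 0.55, False),
    ("exaggerated pitch", 0.70, True),
    ("neutral summary", 0.50, False),
]

def simulated_audience(approval_prob, rng):
    """A simulated audience member approves a message with some probability."""
    return rng.random() < approval_prob

def best_of_n(n_trials=1000, seed=0):
    """Select the candidate with the highest empirical approval rate.

    The selection criterion sees only approval, not truthfulness -- so any
    deceptive message the audience rewards will win the competition.
    """
    rng = random.Random(seed)
    scores = {}
    for text, p, _deceptive in CANDIDATES:
        wins = sum(simulated_audience(p, rng) for _ in range(n_trials))
        scores[text] = wins / n_trials
    winner = max(scores, key=scores.get)
    return winner, scores

winner, scores = best_of_n()
print(winner)  # the deceptive but better-approved message is selected
```

The point of the sketch is that nothing in the objective penalizes deception; optimizing for approval alone selects whatever the audience rewards.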


Large Language Models for Oral History Understanding with Text Classification and Sentiment Analysis

Cherukuri, Komala Subramanyam, Moses, Pranav Abishai, Sakata, Aisa, Chen, Jiangping, Chen, Haihua

arXiv.org Artificial Intelligence

Oral histories are vital records of lived experience, particularly within communities affected by systemic injustice and historical erasure. Effective and efficient analysis of these archives can promote access to and understanding of the oral histories they contain. However, large-scale analysis remains limited due to the archives' unstructured format, emotional complexity, and high annotation costs. This paper presents a scalable framework to automate semantic and sentiment annotation for the Japanese American Incarceration Oral History (JAIOH) collection. Using LLMs, we construct a high-quality dataset, evaluate multiple models, and test prompt engineering strategies in historically sensitive contexts. Our multiphase approach combines expert annotation, prompt design, and LLM evaluation with ChatGPT, Llama, and Qwen. We labeled 558 sentences from 15 narrators for sentiment and semantic classification, then evaluated zero-shot, few-shot, and RAG strategies. For semantic classification, ChatGPT achieved the highest F1 score (88.71%), followed by Llama (84.99%) and Qwen (83.72%). For sentiment analysis, Llama slightly outperformed Qwen (82.66%) and ChatGPT (82.29%), with all models showing comparable results. The best prompt configurations were used to annotate 92,191 sentences from 1,002 interviews in the JAIOH collection. Our findings show that LLMs can effectively perform semantic and sentiment annotation across large oral history collections when guided by well-designed prompts. This study provides a reusable annotation pipeline and practical guidance for applying LLMs in culturally sensitive archival analysis. By bridging archival ethics with scalable NLP techniques, this work lays the groundwork for responsible use of artificial intelligence in digital humanities and preservation of collective memory. GitHub: https://github.com/kc6699c/LLM4OralHistoryAnalysis.
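The zero-shot and few-shot prompting strategies the paper compares can be sketched as simple prompt builders. The label sets below are hypothetical stand-ins (the actual JAIOH categories are not given in this summary), and the model call itself is omitted:

```python
# Hypothetical label sets for illustration -- not the paper's actual schema.
SEMANTIC_LABELS = ["family", "incarceration", "education", "community", "other"]
SENTIMENT_LABELS = ["positive", "neutral", "negative"]

def build_zero_shot_prompt(sentence, labels):
    """Compose a zero-shot classification prompt for one transcript sentence."""
    label_list = ", ".join(labels)
    return (
        "You are annotating oral history transcripts. "
        f"Classify the sentence into exactly one of: {label_list}.\n"
        f"Sentence: {sentence}\n"
        "Answer with the label only."
    )

def build_few_shot_prompt(sentence, labels, examples):
    """Few-shot variant: prepend labeled examples to the same instruction."""
    shots = "\n".join(f"Sentence: {s}\nLabel: {l}" for s, l in examples)
    return shots + "\n" + build_zero_shot_prompt(sentence, labels)

prompt = build_zero_shot_prompt("We were sent to the camp in 1942.",
                                SEMANTIC_LABELS)
few_shot = build_few_shot_prompt("The harvest was good that year.",
                                 SENTIMENT_LABELS,
                                 [("We lost everything.", "negative")])
```

Constraining the model to "answer with the label only" keeps the outputs machine-parseable, which matters when annotating tens of thousands of sentences automatically.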


NewsEdits 2.0: Learning the Intentions Behind Updating News

Spangher, Alexander, Huang, Kung-Hsiang, Cho, Hyundong, May, Jonathan

arXiv.org Artificial Intelligence

As events progress, news articles often update with new information: if we are not cautious, we risk propagating outdated facts. In this work, we hypothesize that linguistic features indicate factual fluidity, and that we can predict which facts in a news article will update using solely the text of the article (i.e. no external resources like search engines). We test this hypothesis, first, by isolating fact-updates in large news revisions corpora. News articles may update for many reasons (e.g. factual, stylistic, narrative). We introduce the NewsEdits 2.0 taxonomy, an edit-intentions schema that separates fact updates from stylistic and narrative updates in news writing. We annotate over 9,200 pairs of sentence revisions and train high-scoring ensemble models to apply this schema. Then, taking a large dataset of silver-labeled pairs, we show that we can predict when facts will update in older article drafts with high precision. Finally, to demonstrate the usefulness of these findings, we construct a language model question-answering (LLM-QA) abstention task: we want the LLM to abstain from answering questions when the relevant information is likely to have become outdated. Using our predictions, we show that LLM abstention reaches near-oracle levels of accuracy.
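The abstention task can be sketched as a simple threshold on a predicted fluidity score; the `answer_fn` stub and the scores below are hypothetical stand-ins for the paper's classifier and LLM:

```python
def answer_or_abstain(question, answer_fn, fluidity_score, threshold=0.5):
    """Abstain when the supporting fact is predicted likely to update.

    fluidity_score: predicted probability (0..1) that the underlying fact
    will change in a later revision of the article.
    """
    if fluidity_score >= threshold:
        return None  # abstain: the answer may already be outdated
    return answer_fn(question)

# Hypothetical usage with a stubbed QA function.
qa = lambda q: "42 confirmed"  # stand-in for an LLM answer
volatile = answer_or_abstain("How many casualties?", qa, fluidity_score=0.9)
stable = answer_or_abstain("Where did it happen?", qa, fluidity_score=0.1)
print(volatile, stable)
```

The better the fluidity predictor, the closer this gate approaches the oracle that abstains exactly on the facts that later change.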


LLMEmbed: Rethinking Lightweight LLM's Genuine Function in Text Classification

Liu, Chun, Zhang, Hongguang, Zhao, Kainan, Ju, Xinghai, Yang, Lin

arXiv.org Artificial Intelligence

With the rapid rise of Large Language Models (LLMs), prompt-learning has become a promising method studied across many research areas. Recently, many attempts based on prompt-learning have been made to improve the performance of text classification. However, most of these methods are based on heuristic Chain-of-Thought (CoT) and tend to be more complex but less efficient. In this paper, we rethink the LLM-based text classification methodology and propose a simple and effective transfer learning strategy, namely LLMEmbed, to address this classical but challenging task. Specifically, we first study how to properly extract and fuse text embeddings from various lightweight LLMs at different network depths to improve their robustness and discrimination, then adapt such embeddings to train the classifier. We perform extensive experiments on publicly available datasets, and the results show that LLMEmbed achieves strong performance with low training overhead using lightweight LLM backbones, compared to recent methods based on larger LLMs (e.g., GPT-3) and sophisticated prompt-based strategies. Our LLMEmbed achieves adequate accuracy on publicly available benchmarks without any fine-tuning while using merely 4% of the model parameters, 1.8% of the electricity consumption, and 1.5% of the runtime of its counterparts. Code is available at: https://github.com/ChunLiu-cs/LLMEmbed-ACL2024.
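One plausible reading of the extract-and-fuse step can be sketched as follows. The exact fusion used by LLMEmbed is not specified in this summary; mean-pooling each layer's token embeddings and concatenating across depths is an assumption for illustration, with random arrays standing in for real LLM activations:

```python
import numpy as np

def fuse_layer_embeddings(layer_embeddings):
    """Fuse per-depth embeddings into one fixed-length feature vector.

    layer_embeddings: list of (num_tokens, dim) arrays, one per network depth.
    Each layer is mean-pooled over tokens, then the pooled vectors are
    concatenated -- an assumed fusion scheme, not the paper's exact one.
    """
    pooled = [layer.mean(axis=0) for layer in layer_embeddings]
    return np.concatenate(pooled)

# Hypothetical activations from three depths of a lightweight LLM:
# 12 tokens, 64-dimensional hidden states per layer.
rng = np.random.default_rng(0)
layers = [rng.normal(size=(12, 64)) for _ in range(3)]
features = fuse_layer_embeddings(layers)
print(features.shape)  # one fixed-length vector per text, fed to the classifier
```

The resulting vector can then train any lightweight classifier (e.g. logistic regression), which is what keeps training overhead low: the LLM only runs forward passes, never gradient updates.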


From Sparse to Soft Mixtures of Experts

Puigcerver, Joan, Riquelme, Carlos, Mustafa, Basil, Houlsby, Neil

arXiv.org Artificial Intelligence

Despite their success, MoEs suffer from a number of issues: training instability, token dropping, inability to scale the number of experts, or ineffective finetuning. In this work, we propose Soft MoE, a fully-differentiable sparse Transformer that addresses these challenges, while maintaining the benefits of MoEs. Soft MoE performs an implicit soft assignment by passing different weighted combinations of all input tokens to each expert. As in other MoE works, experts in Soft MoE only process a subset of the (combined) tokens, enabling larger model capacity at lower inference cost. In the context of visual recognition, Soft MoE greatly outperforms standard Transformers (ViTs) and popular MoE variants (Tokens Choice and Experts Choice). For example, Soft MoE-Base/16 requires 10.5x lower inference cost (5.7x lower wall-clock time) than ViT-Huge/14 while matching its performance after similar training. Soft MoE also scales well: Soft MoE Huge/14 with 128 experts in 16 MoE layers has over 40x more parameters than ViT Huge/14, while inference time cost grows by only 2%, and it performs substantially better.
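The soft dispatch-and-combine described above can be sketched in NumPy. This is a minimal illustration, assuming one slot per expert and toy linear experts; the real layer uses learned slot parameters and MLP experts inside a Transformer:

```python
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def soft_moe_layer(X, phi, experts):
    """Soft MoE layer: every slot receives a weighted mix of ALL input tokens.

    X:       (m, d) input tokens
    phi:     (d, p) learnable slot parameters (here p = number of experts,
             i.e. one slot per expert)
    experts: list of p per-slot functions mapping (d,) -> (d,)
    """
    logits = X @ phi                # (m, p) token-slot affinities
    D = softmax(logits, axis=0)     # dispatch weights: normalize over tokens
    slot_inputs = D.T @ X           # (p, d) each slot = convex mix of tokens
    slot_outputs = np.stack([f(s) for f, s in zip(experts, slot_inputs)])
    C = softmax(logits, axis=1)     # combine weights: normalize over slots
    return C @ slot_outputs         # (m, d) per-token outputs

m, d, p = 8, 16, 4
rng = np.random.default_rng(0)
X = rng.normal(size=(m, d))
phi = rng.normal(size=(d, p))
# Toy linear "experts" with fixed random weights.
experts = [lambda s, W=rng.normal(size=(d, d)) * 0.1: s @ W for _ in range(p)]
Y = soft_moe_layer(X, phi, experts)
print(Y.shape)
```

Because every weight in the dispatch and combine matrices is a softmax output, the whole layer is differentiable and no token is ever dropped, which is exactly what sidesteps the instability and token-dropping issues of hard routing.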


Generative AI-empowered Simulation for Autonomous Driving in Vehicular Mixed Reality Metaverses

Xu, Minrui, Niyato, Dusit, Chen, Junlong, Zhang, Hongliang, Kang, Jiawen, Xiong, Zehui, Mao, Shiwen, Han, Zhu

arXiv.org Artificial Intelligence

In the vehicular mixed reality (MR) Metaverse, the distance between physical and virtual entities can be overcome by fusing the physical and virtual environments with multi-dimensional communications in autonomous driving systems. Assisted by digital twin (DT) technologies, connected autonomous vehicles (AVs), roadside units (RSUs), and virtual simulators can maintain the vehicular MR Metaverse via digital simulations for sharing data and making driving decisions collaboratively. However, large-scale traffic and driving simulation via realistic data collection and fusion from the physical world, for online prediction and offline training in autonomous driving systems, is difficult and costly. In this paper, we propose an autonomous driving architecture in which generative AI is leveraged to synthesize unlimited conditioned traffic and driving data in simulations for improving driving safety and traffic efficiency. First, we propose a multi-task DT offloading model for the reliable execution of heterogeneous DT tasks with different requirements at RSUs. Then, based on the preferences of AVs' DTs and collected realistic data, virtual simulators can synthesize unlimited conditioned driving and traffic datasets to further improve robustness. Finally, we propose a multi-task enhanced auction-based mechanism to provide fine-grained incentives for RSUs in providing resources for autonomous driving. The property analysis and experimental results demonstrate that the proposed mechanism and architecture are strategy-proof and effective, respectively.
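The paper's multi-task enhanced auction is not detailed in this summary; as an illustration of the strategy-proofness property it claims, here is the classic single-item second-price (Vickrey) auction, in which truthful bidding is a dominant strategy. The RSU names and bids are invented for the sketch:

```python
def vickrey_auction(bids):
    """Second-price (Vickrey) auction: strategy-proof single-item allocation.

    bids: dict mapping bidder id -> bid value.
    The highest bidder wins but pays only the second-highest bid, so no
    bidder can gain by misreporting their true valuation.
    """
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner, _ = ranked[0]
    price = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, price

# Hypothetical RSUs bidding for the right to serve a digital-twin task.
winner, price = vickrey_auction({"rsu_a": 5.0, "rsu_b": 8.0, "rsu_c": 3.0})
print(winner, price)
```

The paper's mechanism extends this idea to multiple heterogeneous DT tasks at once; the core incentive argument, that payments decouple from a bidder's own report, is the same.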


Pakistan police, kin seek murder charge over driver killed along with Taliban chief in U.S. drone strike

The Japan Times

QUETTA, PAKISTAN – The family of a driver who was killed alongside Taliban chief Mullah Akhtar Mansour in a U.S. drone strike in Pakistan has filed a case against U.S. officials, seeking to press murder charges, police said Sunday. Mansour had entered Pakistan from Iran using a false name and fake Pakistani identity documents on May 21, when his car was targeted by a U.S. drone. The driver, who was also killed, was later identified as Mohammed Azam. The police filed a case on behalf of Azam's family, police official Abdul Wakil Mengal said. It was not immediately clear what legal avenues the family can realistically pursue.


Senior Taliban figure says death of leader could unify group

U.S. News

A Pakistani police officer and paramedics stand beside the bodies of two people reportedly killed in a U.S. drone strike in the Ahmad Wal area of Baluchistan province, Pakistan, at a hospital in Quetta, Pakistan, Sunday, May 22, 2016. A senior commander of the Afghan Taliban confirmed on Sunday that the extremist group's leader, Mullah Mohammad Akhtar Mansour, had been killed in the strike.